Spatial Pyramid Matching

نویسنده

  • Svetlana Lazebnik
چکیده

This chapter deals with the problem of whole-image categorization. We may want to classify a photograph based on a high-level semantic attribute (e.g., indoor or outdoor), scene type (forest, street, office, etc.), or object category (car, face, etc.). Our philosophy is that such global image tasks can be approached in a holistic fashion: It should be possible to develop image representations that use low-level features to directly infer high-level semantic information about the scene without going through the intermediate step of segmenting the image into more “basic” semantic entities. For example, we should be able to recognize that an image contains a beach scene without first segmenting and identifying its separate components, such as sand, water, sky, or bathers. This philosophy is inspired by psychophysical and psychological evidence that people can recognize scenes by considering them in a “holistic” manner, while overlooking most of the details of the constituent objects (Oliva and Torralba, 2001). It has been shown that human subjects can perform high-level categorization tasks extremely rapidly and in the near absence of attention (Thorpe et al., 1996; Fei-Fei et al., 2002), which would most likely preclude any feedback or detailed analysis of individual parts of the scene. Renninger and Malik (2004) have proposed an orderless texture histogram model to replicate human performance on “pre-attentive” classification tasks. In the computer vision literature, more advanced orderless methods based on bags of features (Csurka et al., 2004) have recently demonstrated impressive levels of performance for image classification. These methods are simple and efficient, and they can be made robust to clutter, occlusion, viewpoint change, and even non-rigid deformations. Unfortunately, they completely disregard the spatial layout of the features in the image, and

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improved Spatial Pyramid Matching for Image Classification

Spatial analysis of salient feature points has been shown to be promising in image analysis and classification. In the past, spatial pyramid matching makes use of both of salient feature points and spatial multiresolution blocks to match between images. However, it is shown that different images or blocks can still have similar features using spatial pyramid matching. The analysis and matching ...

متن کامل

Application of sparse coding with spatial pyramid matching for face expression classification

In this work I evaluate the performance of image classification method based on spatial pyramid matching for sparse codes, using the JAFFE database of facial expressions. I show that the method is comparable to other methods, typically used for such datasets, and I also introduce some attempts that I made towards the improvement of the algorithm.

متن کامل

Image Classification Using Sparse Coding and Spatial Pyramid Matching

Recently, the Support Vector Machine (SVM) using Spatial Pyramid Matching (SPM) kernel has achieved remarkable successful in image classification. The classification accuracy can be improved further when combining the sparse coding with SPM. However, the existing methods give the same weight of patches of SPM at different levels. Clearly the discriminative powers of SPM at different levels are ...

متن کامل

Pyramid Person Matching Network for Person Re-identification

In this work, we present a deep convolutional pyramid person matching network (PPMN) with specially designed Pyramid Matching Module to address the problem of person reidentification. The architecture takes a pair of RGB images as input, and outputs a similiarity value indicating whether the two input images represent the same person or not. Based on deep convolutional neural networks, our appr...

متن کامل

Pyramid Stereo Matching Network

Recent work has shown that depth estimation from a stereo pair of images can be formulated as a supervised learning task to be resolved with convolutional neural networks (CNNs). However, current architectures rely on patch-based Siamese networks, lacking the means to exploit context information for finding correspondence in illposed regions. To tackle this problem, we propose PSMNet, a pyramid...

متن کامل

Combining Orientational Pooling Features for Scene Recognition

Scene recognition is a basic task towards image understanding. Spatial Pyramid Matching (SPM) has been shown to be an efficient solution for spatial context modeling. In this paper, we introduce an alternative approach, Orientational Pyramid Matching (OPM), for orientational context modeling. Our approach is motivated by the observation that the 3D orientations of objects are a crucial factor t...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008